highlight large physics projects, whose configurations are already available on the grid or will
ILDG. Statistics about the ILDG are also reported.

1. Introduction

Six years ago at Lattice 2002 the construction of the International Lattice Data Grid was
proposed [1]. After extensive work [2, 3, 4] by many people, the first stage of system construction
was completed last year. The ILDG is already used to open data to the public and to share data
within collaborations. With this situation in mind, we would like to invite new users to the ILDG.

For this purpose, we give a brief system overview in sec. 2, describe how to use data on the
grid in sec. 3 and highlight major ensembles on the grid in sec. 4, so that one can start to find
interesting ensembles. We also give statistics about the ILDG in sec. 5.

2. System overview 

The ILDG has two aspects, metadata and middleware. Work on each aspect is carried out by
the corresponding working group. Current members of the working groups are

• Metadata working group (MDWG): P. Coddington (Adelaide), T. Yoshie (Tsukuba), D. Pleiter 
(DESY), G. Andronico (INFN), C. Maynard (Edinburgh), C. DeTar (Utah), J. Simone (FNAL), 
R. Edwards, B. Joo (JLAB) 

• Middleware working group (MWWG): P. Coddington, S. Zhang (Adelaide), T. Amagasa,
N. Ishii, O. Tatebe, M. Sato (Tsukuba), D. Melkumyan, D. Pleiter (DESY), G. Beckett,
R. Ostrowski (Edinburgh), J. Simone (FNAL), B. Joo, C. Watson (JLAB)

In addition, the ILDG board, which consists of one representative from each country, supervises
the two working groups and decides strategic issues. Current members are

• ILDG board: R. Brower (USA), K. Jansen (Germany), R. Kenway (UK, chair this year), 
D. Leinweber (Australia), O. Pene (France), F. Di Renzo (Italy), A. Ukawa (Japan) 

Figure 1 sketches the data and metadata components of the ILDG. Properties (metadata) of an
ensemble, such as the physics parameters of the simulation, and those of a configuration, such as
the trajectory number, are marked up with QCDml, an XML-based markup language. The
ensemble XML and the configuration XML are linked by a unique name of the ensemble called
"markovChainURI". The configuration XML and a configuration file are linked by a unique name
of the configuration called "dataLFN" (data Logical File Name).
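
The linkage between the two kinds of metadata documents can be illustrated with a small Python sketch. The XML fragments below are heavily simplified stand-ins and do not follow the real QCDml schema, which is namespaced and much richer. Only the element names markovChainURI and dataLFN come from the text; the root elements, the trajectory element, and all URIs and LFNs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical ensemble document (not the real QCDml schema).
ENSEMBLE_XML = """<markovChain>
  <markovChainURI>mc://example/ens-A</markovChainURI>
</markovChain>"""

# Simplified, hypothetical configuration documents.
CONFIG_XMLS = [
    """<gaugeConfiguration>
  <markovChainURI>mc://example/ens-A</markovChainURI>
  <dataLFN>lfn://example/ens-A/cfg_0020</dataLFN>
  <trajectory>20</trajectory>
</gaugeConfiguration>""",
    """<gaugeConfiguration>
  <markovChainURI>mc://example/ens-A</markovChainURI>
  <dataLFN>lfn://example/ens-A/cfg_0010</dataLFN>
  <trajectory>10</trajectory>
</gaugeConfiguration>""",
    """<gaugeConfiguration>
  <markovChainURI>mc://example/ens-B</markovChainURI>
  <dataLFN>lfn://example/ens-B/cfg_0010</dataLFN>
  <trajectory>10</trajectory>
</gaugeConfiguration>""",
]


def configurations_of(ensemble_xml, config_xmls):
    """Return sorted (trajectory, dataLFN) pairs of the configurations
    whose markovChainURI matches that of the given ensemble document."""
    uri = ET.fromstring(ensemble_xml).findtext("markovChainURI").strip()
    matches = []
    for doc in config_xmls:
        cfg = ET.fromstring(doc)
        if cfg.findtext("markovChainURI").strip() == uri:
            trajectory = int(cfg.findtext("trajectory"))
            lfn = cfg.findtext("dataLFN").strip()
            matches.append((trajectory, lfn))
    return sorted(matches)
```

Calling configurations_of(ENSEMBLE_XML, CONFIG_XMLS) selects the two documents that share the ensemble's markovChainURI and ignores the one from the other ensemble; the dataLFN of each match is then the key used to locate the file itself.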

Figure 2 summarizes the middleware components of the ILDG and the data-metadata flow [5].
Configurations are stored in "Storage Elements", while metadata are stored in the "Metadata
Catalogue". Queries on ensemble and configuration metadata are made to the Metadata Catalogue.
The "File Catalogue" maps the dataLFN to one or more locations for the file, or URLs of the
file. After you get a URL, you can download the configuration file from one of the storage elements.
Authentication is made before you download. The VOMS (Virtual Organization Membership
Service) server maintains the ILDG virtual organization (VO). The ILDG consists of five regional
grids (RG): CSSM for Australia, JLDG for Japan, LDG (LatFor Data Grid) for continental Europe,
UKQCD for UK, and USQCD for US. Implementations of storage elements and catalogue services
differ from grid to grid, but RGs are inter-operable through a common interface.
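
The lookup chain described above (ensemble name, then dataLFN, then URL) can be sketched in Python as follows. This is a toy model only: the two dictionaries stand in for the Metadata Catalogue and the File Catalogue, every URI, LFN, host name and URL scheme is hypothetical, and the real services are accessed through web-service interfaces and require authentication before download, which is not modelled here.

```python
# Toy stand-in for the Metadata Catalogue:
# markovChainURI -> dataLFNs of the configurations in that ensemble.
METADATA_CATALOGUE = {
    "mc://example/ens-A": [
        "lfn://example/ens-A/cfg_0010",
        "lfn://example/ens-A/cfg_0020",
    ],
}

# Toy stand-in for the File Catalogue:
# dataLFN -> one or more URLs (replicas) on storage elements.
FILE_CATALOGUE = {
    "lfn://example/ens-A/cfg_0010": [
        "gsiftp://se1.example.org/ens-A/cfg_0010",
        "gsiftp://se2.example.org/ens-A/cfg_0010",
    ],
    "lfn://example/ens-A/cfg_0020": [
        "gsiftp://se1.example.org/ens-A/cfg_0020",
    ],
}


def resolve(markov_chain_uri, preferred_host=None):
    """Map each configuration of an ensemble to a single download URL.

    If a replica on preferred_host exists it is chosen; otherwise the
    first URL listed in the file catalogue is used.
    """
    plan = {}
    for lfn in METADATA_CATALOGUE.get(markov_chain_uri, []):
        urls = FILE_CATALOGUE.get(lfn, [])
        if not urls:
            continue  # metadata registered, but no replica is known
        preferred = [u for u in urls if preferred_host and preferred_host in u]
        plan[lfn] = preferred[0] if preferred else urls[0]
    return plan
```

For example, resolve("mc://example/ens-A", preferred_host="se2.example.org") picks the se2 replica for the first configuration and falls back to se1 for the second, which has only one replica.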

Figure 1: Metadata and data components and how they are linked.

Figure 2: Middleware components and data-metadata flow.



3. Using data on ILDG

Although the ILDG system is somewhat complicated, users do not have to remember the details,
because easy-to-use tools have been developed and are provided by the MWWG.
The procedure to use data on the ILDG is as follows.

1. Join the ILDG VO. To do this, one obtains a grid certificate from a CA (Certificate Authority)
trusted by the IGTF (International Grid Trust Federation [6]), and visits the ILDG VOMRS
(VO Member Registration Service [7]) to register. A representative of your regional grid will
approve your registration request. The certificate is necessary when you download files.

2. Find interesting ensembles. You can use portals or tools provided by RGs, which can be
accessed from the ILDG official web page [8]. Some details will be given below.

3. Check the access policy before you use data. Data on the ILDG are either public or can be used
after negotiation with the collaboration. The best way to learn the policy is to contact the
collaboration.

4. Download configurations. One can use a standard command line tool ildg-get provided by 
the MWWG. Note that some RGs support other methods to access data, such as ltools (LDG), 
DiGS tools (UKQCD) and uberftp (JLDG). 

5. Do research and write a paper. Please acknowledge the collaboration by citing papers speci- 
fied by the collaboration and the ILDG web page "http://www.lqcd.org/ildg/". 

In addition to RG portals, a web page [9] prepared for the ILDG tutorial session of this conference,
which is organized by C. Urbach and C. Allton, provides you with a good starting point.

In order to help users find ensembles, RGs provide a variety of data browsers and tools. The
USQCD portal and the LDG portal [10, 11] show you a list of ensembles and details of each
ensemble. The CSSM portal [12] enables you to search ensembles by specifying actions and other
physics parameters. The UKQCD ildg-browser [13] supports semantic search based on XML. A
new JLDG portal, which is still under development and will appear soon in [14], will support
narrowing search by faceted navigation. Facets are categories of XML documents, such as names
of regional grid, collaboration, project etc., and physics parameters. You will be able to specify
any items in any order for narrowing candidates. See a screen shot in Fig. 3. You can use portals
without joining the ILDG VO. Please visit and try all portals freely and find your favorite one.
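
The idea behind faceted navigation can be illustrated with a minimal Python sketch. It is not the implementation of the JLDG portal; the record layout, the facet names (grid, collaboration, nf) and the helper functions are assumptions made for illustration, with toy records loosely mirroring ensembles mentioned in this report.

```python
from collections import Counter

# Toy ensemble records; facet names and the record layout are hypothetical.
ENSEMBLES = [
    {"grid": "JLDG", "collaboration": "CP-PACS", "nf": "2"},
    {"grid": "JLDG", "collaboration": "CP-PACS+JLQCD", "nf": "2+1"},
    {"grid": "LDG", "collaboration": "ETM", "nf": "2"},
    {"grid": "LDG", "collaboration": "QCDSF", "nf": "2"},
    {"grid": "UKQCD", "collaboration": "UKQCD+RBC", "nf": "2+1"},
    {"grid": "USQCD", "collaboration": "MILC", "nf": "2+1"},
]


def narrow(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [
        r for r in records
        if all(r.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(records, facet):
    """Count the remaining candidates for each value of one facet."""
    return Counter(r[facet] for r in records)
```

Because each step only filters the current candidate list, the facets can be applied in any order: narrow(narrow(ENSEMBLES, nf="2+1"), grid="JLDG") and narrow(narrow(ENSEMBLES, grid="JLDG"), nf="2+1") give the same single record. A portal would typically show facet_counts for the remaining candidates next to each facet value.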



4. Ensembles on the grid 

This section summarizes major ensembles which are already available or will appear in the 
near future. We have asked several collaborations to list interesting ensembles, and have compiled 
replies. Therefore, what is shown below is not a complete list and may be biased due to our queries. 

In the following tables, we use abbreviations for the action field, Sym: Symanzik, LW: Luescher-
Weisz, np: non-perturbatively O(a) improved, tp: tadpole improved, and TM: twisted mass. The
number of configurations is approximate and the status field indicates the status of ensembles by
the key words public: publicly available, negotiable: available after negotiation with
the collaboration, prod.: production run is on-going, and prep.: configurations are in preparation
and will be available soon.

4.1 CSSM 

Table 1 lists ensembles on the CSSM grid. The CSSM collaboration has started to accumulate
Nf = 2 configurations generated with the FLIC fermion action. We hear that they will continue
parameter tuning for a while and plan to quantify the advantages of the FLIC action for dynamical
quark simulations.

Figure 3: A screen shot of faceted navigation, taken from the new JLDG portal.

Table 1: Major ensembles on the CSSM grid.



The CSSM grid has a 7 TB disk and 20 TB tape system for storage elements. Catalogue 
services are operated at CSSM. 

4.2 JLDG 

Table 2 shows ensembles on the JLDG. In addition to the CP-PACS Nf = 2 and the
CP-PACS+JLQCD Nf = 2+1 ensembles already public to the world, the Nf = 2 and Nf = 2+1
overlap quark ensembles and the Nf = 2+1 npClover ensembles with very light quark masses will
be publicly released by the JLQCD collaboration and the PACS-CS collaboration, respectively.

Storage elements, with 35 TB disk space in total, are distributed over six sites in Japan:
Tsukuba, KEK, Kyoto, Osaka, Hiroshima and Kanazawa. Catalogue services are operated at CCS,
Tsukuba.

Table 2: Major ensembles on the JLDG. (*1: will be public, *2: available date not decided yet, *3: will be
public 6 months after a spectrum paper is submitted.)

4.3 LDG

For the LatFor data grid, ensembles generated by two major contributors are listed in table 3.
The ETM collaboration has carried out Nf = 2 simulations with the Wilson twisted mass quark
action for three lattice spacings and for a wide range of quark masses. All of these configurations
are already on the grid and can be used after negotiation with the collaboration. We hear that
they will probably become publicly available by the end of this year. The collaboration has also
started Nf = 2+1+1 simulations. Nf = 2 configurations from the QCDSF collaboration, which also
cover wide ranges of lattice spacings and quark masses, require negotiation. Nf = 2+1 simulations
with the SLiNC quark action are on-going. The LDG contains data from other collaborations such
as SESAM, TχL, GRAL, DIK, and Theta. We hear that the ALPHA collaboration has no plan to
submit data, and the Bern-Marseilles-Wuppertal collaboration has not made a decision about their
plans for the ILDG.

The LatFor data grid is for continental Europe. Storage elements are distributed over three sites
in Germany, namely DESY (Hamburg+Zeuthen), JSC (Jülich) and ZIB (Berlin), as well as CC-IN2P3
in Lyon, France and INFN Parma in Italy. Storage elements have a tape back-end without a fixed
storage quota. The RG operates catalogue services at DESY.

4.4 UKQCD 

The UKQCD grid (see table 4) contains Nf = 2+1 domain wall ensembles generated by the
joint collaboration of the UKQCD and the RBC. Configurations on 16^3 lattices were publicly
available before this conference. The collaboration has publicly released configurations on 24^3
lattices this August. Simulations on finer lattices are on-going. The grid has also archived the
UKQCD asqtad lattices. Configurations on the coarser lattice are publicly available, while those
on the finer lattice are negotiable.

The UKQCD grid consists of seven sites in the UK: Edinburgh, ACF (U. of Edinburgh), Glasgow,
Liverpool, Swansea, RAL Didcot and Southampton. The grid maintains 80 TB disk space (as of
March 2007) for storage elements. Catalogue services are operated at EPCC, Edinburgh.

Table 3: Major ensembles on the LDG. (*1: will probably become publicly available by the end of 2008, *2:
based on Metadata Catalogue)

Table 4: Major ensembles on the UKQCD grid. (*1: public release in this August)



4.5 USQCD 

Table 5 summarizes ensembles in the USQCD grid. The MILC collaboration has been generating
an extensive set of Nf = 2+1 ensembles using the asqtad quark action. A remarkable point
is that the collaboration makes all data public as soon as they are created. Currently, they are
generating data on large lattices at lattice spacings of 0.12 and 0.09 fm, and on much finer lattices.
They will be available on the grid or the NERSC Gauge Connection web site [15]. The LHP
collaboration is generating anisotropic lattices. Nf = 2 data are already publicly available, and
Nf = 2+1 data are coming soon.

Storage elements of this grid are operated at Fermilab as a part of huge resources, while cata- 
logue services are operated at JLAB. 

Table 5: Major ensembles on the USQCD grid. (*1: aniso. clover/ tadpole improved Symanzik with no
rectangle loops in temporal direction.)



Table 6: Statistics about the ILDG as of June 26, 2008. Data are taken from VOMRS and Metadata
Catalogues. Data size does not include file replicas.

5. Statistics 

In order to see how the ILDG is utilized, statistics about the ILDG are summarized in table 6.

We have 93 ILDG VO members in total. Because the LDG and the UKQCD grid have many
users, we suppose that they use the ILDG as their primary storage infrastructure. The CSSM grid
and the USQCD grid have genuine users. The JLDG has only admin users. We hope many Japanese
users, who still use an old system, will move to the ILDG.

The number of ensembles stored in the ILDG has increased almost linearly since January 2006 and
has reached 183. We currently have 190 K configurations with a total size of about 40 TB.
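
As a rough orientation, the figures quoted above imply the following averages. This is only a back-of-the-envelope sketch: it assumes decimal units (1 TB = 10^12 bytes) and treats the rounded numbers from the text as exact, so the results are indicative at best and say nothing about any individual ensemble.

```python
# Figures quoted in the text (rounded values, treated here as exact).
n_ensembles = 183
n_configurations = 190_000       # "190 K configurations"
total_bytes = 40 * 10**12        # "40 TB", assuming decimal terabytes

# Average file size per configuration, in megabytes (10^6 bytes).
average_mb = total_bytes / n_configurations / 10**6

# Average number of configurations per ensemble.
average_per_ensemble = n_configurations / n_ensembles

print(f"about {average_mb:.0f} MB per configuration")
print(f"about {average_per_ensemble:.0f} configurations per ensemble")
```

With these assumptions, a typical configuration is of the order of 200 MB and a typical ensemble holds of the order of 1000 configurations.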

6. Conclusions and future work 

The ILDG continues stable operation and has already accumulated a lot of valuable
configurations. Usability of the ILDG has improved significantly. The ILDG is becoming an important
research infrastructure in this community. We hope that many more users join the ILDG and make
better use of archived data for physics research.

In this report, we have described how to use data on the ILDG. Submitting data to the ILDG
is a somewhat complicated procedure. Working group members think that making the procedure
easy is an important future direction. The two working groups are also discussing extensions and
improvements of the system, such as quark propagator sharing (MDWG) and replication of data
among regional grids (MWWG).

I am grateful to all members of the ILDG working groups and the ILDG board, in particular
to G. Beckett, C. DeTar, and D. Pleiter for helpful suggestions on the manuscript. I also thank
colleagues who provided us with ensemble information on each regional grid. A part of this work
is supported by the Grant-in-Aid of the Ministry of Education (No. 18104005) of the Japanese
Government.